Number of h-cycles in the Internet at the Autonomous System Level
We present here a study of the clustering and cycles present in the graph of the Internet at the
Autonomous Systems level. Even if the whole structure is changing with time, we present some
evidence that the statistical distributions of cycles of order 3, 4, 5 remain stable during the evolution.
This could suggest that cycles are among the characteristic motifs of the Internet. Furthermore,
we compare data with the results obtained for growing network models aimed at reproducing the
Internet evolution, namely the fitness model, the Generalized Network Growth model and the
Bosonic Network model. We are able to find some qualitative agreement with the experimental
situation, even if the actual number of cycles seems to be larger in the data than in any proposed
growing network model. Capturing this feature of the Internet represents one of the
challenges for future Internet modeling.
The Internet is a beautiful example of a complex system
with many degrees of freedom resulting in global scaling
properties. It has been shown that the Internet
belongs to the wide class of scale-free networks.
Indeed, it can be described as a network, with nodes
and links representing respectively Autonomous Systems
(AS) and physical lines connecting them; moreover, its
degree distribution follows a power-law behavior.

Different topological quantities have also been measured
besides the degree distribution exponent. Among
those are the clustering coefficient $C(k)$ and the average
nearest-neighbor degree $k_{nn}(k)$ of a node as a function
of its degree $k$ [13]. In particular, measurements in
the Internet yield $C(k) \sim k^{-0.75}$ and $k_{nn}(k) \sim k^{-\nu}$ with
$\nu \simeq 0.5$ [13]. A two-vertex degree anti-correlation has
also been measured. Accordingly, the Internet is said to
display disassortative mixing, because nodes prefer
to be linked to peers with different rather than similar
degree. Moreover, the modularity of the Internet due to
national patterns has been studied by measuring the
slow decaying modes of a diffusion process defined on it.

Recently, more attention has been devoted to network
motifs, i.e. subgraphs that recur with a higher
frequency than in maximally random graphs with the
same degree distribution. Among those, the most natural
class includes cycles, closed paths of various
lengths that visit each node only once. Cycles (or loops)
are interesting because they account for the multiplicity 
of paths between any two nodes. Therefore, they encode 
the redundant information in the network structure. Following
the arguments of [14], it can be shown that the number $N_h$ of cycles
of size $h$ in an equilibrium undirected scale-free network of $N$ nodes
with a power-law degree distribution $P(k) \sim k^{-\gamma}$ is

$$N_h(N) \sim N^{\xi(h)} \qquad (1)$$

with

$$\xi(h) = \begin{cases} 1 & \text{for } \gamma < 2 \\ 3 - \gamma & \text{for } 2 < \gamma < 3 \\ 0 & \text{for } \gamma > 3 \end{cases} \qquad (2)$$

In other words, $N_h(N)$ is an algebraic function of the
system size with an exponent $\xi$ independent of the length
$h$ of the cycle.

In contrast, the only analytical result for off-equilibrium
scale-free networks refers to the Barabási-Albert model [18], and reads

$$N_h(N) \sim \left(\frac{m}{2}\log N\right)^{\psi(h)} \qquad (3)$$

with $\psi(h) = h$, where $m$ is the number of links added with each new node.

To measure the actual scaling in the Internet at the AS
level, we considered its symmetric adjacency matrix
$\{a_{ij}\}$, with $a_{ij} = 1$ if $i$ and $j$ are connected and $a_{ij} = 0$
otherwise. We assume that no self-loop is present, i.e.
$a_{ii} = 0$ for all $i$. In this case, for $h = 3$ we simply have

$$N_3 = \frac{1}{6}\sum_i (a^3)_{ii} \qquad (4)$$

For $h = 4$ and $h = 5$, by simple arguments it is possible
to show that

$$N_4 = \frac{1}{8}\left[\sum_i (a^4)_{ii} - 2\sum_i (a^2)_{ii}(a^2)_{ii} + \sum_i (a^2)_{ii}\right] \qquad (5)$$

and that

$$N_5 = \frac{1}{10}\left[\sum_i (a^5)_{ii} - 5\sum_i (a^2)_{ii}(a^3)_{ii} + 5\sum_i (a^3)_{ii}\right] \qquad (6)$$
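The closed-walk counting behind the relations for $N_3$, $N_4$ and $N_5$ can be checked on small graphs. The following is a minimal sketch in plain Python, assuming the graph is given as a symmetric 0/1 matrix with a zero diagonal. The helper names (`mat_mul`, `count_cycles`, `complete_graph`) are illustrative and not taken from the paper, and the closed forms are the standard identities for simple graphs, written using $(a^2)_{ii} = k_i$.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def count_cycles(adj):
    """Return (N3, N4, N5) for a simple undirected graph.

    adj is a symmetric 0/1 adjacency matrix with a zero diagonal.
    """
    n = len(adj)
    a2 = mat_mul(adj, adj)
    a3 = mat_mul(a2, adj)
    a4 = mat_mul(a3, adj)
    a5 = mat_mul(a4, adj)
    deg = [a2[i][i] for i in range(n)]  # (a^2)_ii equals the degree k_i
    tr3 = sum(a3[i][i] for i in range(n))
    tr4 = sum(a4[i][i] for i in range(n))
    tr5 = sum(a5[i][i] for i in range(n))
    n3 = tr3 // 6
    n4 = (tr4 - 2 * sum(k * k for k in deg) + sum(deg)) // 8
    n5 = (tr5 - 5 * sum(deg[i] * a3[i][i] for i in range(n)) + 5 * tr3) // 10
    return n3, n4, n5


def complete_graph(n):
    """Adjacency matrix of the complete graph on n nodes."""
    return [[0 if i == j else 1 for j in range(n)] for i in range(n)]


print(count_cycles(complete_graph(5)))  # (10, 15, 12)
```

Dense matrix products cost $O(N^3)$ operations, so for networks with thousands of nodes a sparse representation would likely be preferable; the sketch favors readability over speed.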

The data of the Internet at the Autonomous System
level are collected by the University of Oregon Route
Views Project and made available by the NLANR (National
Laboratory of Applied Network Research). The
subset we used in this manuscript is mirrored at the COSIN
web page http://www.cosin.org. We considered 13 snapshots
of the Internet network at the AS level at different
times, from November 1997 (when $N = 3015$)
to January 2001 ($N = 9048$). Throughout this period,
the degree distribution is a power law with a nearly
constant exponent $\gamma \simeq 2.22(1)$. Using relations (4), (5),
(6), we measure $N_h(t)$ for $h = 3, 4, 5$ in the Internet at
different times, corresponding to different network sizes.

FIG. 1: Number of $h$-loops $N_h$ as a function of the system
size $N$ for loops of length 3, 4, 5.

We observe in Fig. 1 that the data follow a scaling of
the type (1), as predicted by [14] for maximally random
(equilibrium) scale-free networks. Unfortunately, the exponents
$\xi(h)$ strongly depend on $h$, as reported in Table I, and
significantly exceed the predicted value (Eq. (2)) for equilibrium
scale-free networks with the same $\gamma$, that is, $\xi = 0.78$.
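As a worked illustration of how an exponent $\xi(h)$ can be extracted and compared with the equilibrium prediction, the sketch below fits a straight line to $\log N_h$ versus $\log N$. It is an assumption that an ordinary least-squares fit is appropriate; the text does not state the fitting procedure. The function names and the synthetic counts are hypothetical: only the sizes 3015 and 9048 and the values 2.22 and 1.45 come from the text, and the intermediate size 5000 is made up for the example.

```python
import math


def xi_equilibrium(gamma):
    """Exponent predicted by Eq. (2) for equilibrium scale-free networks.

    The boundary values gamma = 2 and gamma = 3 are not covered by the
    strict inequalities of Eq. (2); assigning them here is a choice of
    this sketch.
    """
    if gamma < 2:
        return 1.0
    if gamma < 3:
        return 3.0 - gamma
    return 0.0


def fit_exponent(sizes, counts):
    """Least-squares slope of log(counts) versus log(sizes)."""
    xs = [math.log(n) for n in sizes]
    ys = [math.log(c) for c in counts]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Synthetic counts following an exact power law (not the measured data).
sizes = [3015, 5000, 9048]
counts = [n ** 1.45 for n in sizes]

print(round(fit_exponent(sizes, counts), 2))  # 1.45
print(round(xi_equilibrium(2.22), 2))         # 0.78
```

With real snapshots, `counts` would hold the measured $N_h$ at each network size, and the fitted slope would be compared with `xi_equilibrium(gamma)`.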

So, we can state that loops up to size 5 are much more
frequent in the Internet than in random scale-free networks
with a similar degree distribution. The exponents $\xi(h)$ and
the counts $N_h$ are large even when compared with off-equilibrium
networks inspired by the Internet growth. The models we consider
here are those that most accurately reproduce the Internet behavior
as regards the degree, clustering and centrality probability
distributions.

The fitness model [19], for example, is a growing network
model where, at each time step, a new node is added
to the network and connected by $m$ links to existing ones.
Each node has a fitness $\eta_i$, randomly drawn from a uniform
distribution in [0, 1], which enters into the probability
that a node acquires a new link,

$$\Pi_i = \frac{\eta_i k_i}{\sum_j \eta_j k_j} \qquad (7)$$

The fitness represents an intrinsic ability of a node in the
acquisition of new links. The resulting network is a scale-free
one with $\gamma = 2.255$. It has also been found [13]
that $C(k)$ and $k_{nn}(k)$ are in qualitative agreement with
Internet data.
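One way to simulate this growth rule is sketched below. The seed graph (a complete graph on $m + 1$ nodes), the rejection of repeated targets so that the $m$ links of a new node are distinct, and the function name `grow_fitness_network` are assumptions made for illustration; the text does not specify these details.

```python
import random


def grow_fitness_network(n_nodes, m=2, seed=0):
    """Grow a network where node i gains links with weight eta_i * k_i.

    Returns (neighbors, fitness): neighbors maps each node to the set of
    adjacent nodes, fitness maps each node to its eta value.
    """
    rng = random.Random(seed)
    # Assumed seed graph: complete graph on m + 1 nodes.
    neighbors = {i: set(range(m + 1)) - {i} for i in range(m + 1)}
    fitness = {i: rng.random() for i in range(m + 1)}
    for new in range(m + 1, n_nodes):
        nodes = list(neighbors)
        weights = [fitness[i] * len(neighbors[i]) for i in nodes]
        targets = set()
        # Draw until m distinct targets are found (repeats are rejected).
        while len(targets) < m:
            targets.add(rng.choices(nodes, weights=weights)[0])
        neighbors[new] = set()
        fitness[new] = rng.random()
        for t in targets:
            neighbors[new].add(t)
            neighbors[t].add(new)
    return neighbors, fitness


graph, eta = grow_fitness_network(200, m=2, seed=1)
mean_degree = sum(len(v) for v in graph.values()) / len(graph)
print(mean_degree)  # close to 2m = 4 for large networks
```

Rebuilding the weight list at every step costs $O(N)$ per new node, which is adequate for sizes up to about $10^4$ but would need a smarter sampler for much larger networks.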

As a second instance, we compare the Internet data
to the recently proposed Generalized Network Growth
(GNG) model [22]. According to its definition, at
each time step

1. either a node is added and linked with vertex $i$ with
probability

$$p\,\frac{k_i}{\sum_j k_j} \qquad (8)$$

2. or a link is added (if absent) between nodes $i$ and
$j$ already present, with probability

$$(1 - p)\,\frac{k_i k_j}{\sum_{s \neq l} k_s k_l} \qquad (9)$$



The resulting network is a scale-free one, with $\gamma(p) = 2 + \frac{p}{2 - p}$.
Besides, it displays the nontrivial features of
the degree correlations as measured in the Internet.
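The two growth rules can be turned into a short simulation. The sketch below is written under explicit assumptions, since the exact attachment probabilities are not legible here: a new node attaches to vertex $i$ with probability proportional to $k_i$, and a new link joins an existing pair chosen with probability proportional to $k_i k_j$ (a choice that is at least consistent with a mean degree of $2/p$). The seed graph and the function name `grow_gng_network` are likewise hypothetical.

```python
import random


def grow_gng_network(steps, p=0.5, seed=0):
    """Grow a network by adding either a node (prob. p) or a link (prob. 1 - p).

    Returns a dict mapping each node to the set of its neighbors.
    """
    rng = random.Random(seed)
    neighbors = {0: {1}, 1: {0}}  # assumed seed graph: a single link
    for _ in range(steps):
        nodes = list(neighbors)
        degrees = [len(neighbors[i]) for i in nodes]
        if rng.random() < p:
            # Rule 1 (assumed form): attach a new node to i with weight k_i.
            target = rng.choices(nodes, weights=degrees)[0]
            new = len(neighbors)
            neighbors[new] = {target}
            neighbors[target].add(new)
        else:
            # Rule 2 (assumed form): draw i and j independently with weight
            # k_i and k_j, so the pair has weight k_i * k_j; the link is
            # added only if i != j and it is absent.
            i, j = rng.choices(nodes, weights=degrees, k=2)
            if i != j and j not in neighbors[i]:
                neighbors[i].add(j)
                neighbors[j].add(i)
    return neighbors


graph = grow_gng_network(2000, p=0.5, seed=1)
mean_degree = sum(len(v) for v in graph.values()) / len(graph)
print(mean_degree)  # somewhat below 2/p = 4 because rejected links are not retried
```

In this sketch a step that proposes an existing link or a self-loop adds nothing, so the measured mean degree falls slightly short of $2/p$; retrying until a new link is found would be an equally defensible reading of "if absent".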

Finally, we considered the Bosonic Network (BN),
where each node $i$ is assigned an innate quality in the
spirit of Ref. [20], represented by a random 'energy' $\epsilon_i$
drawn from the probability distribution $p(\epsilon_i)$. The attractiveness
of each node $i$ is then determined jointly by
its connectivity $k_i$ and its energy $\epsilon_i$. Namely, the probability
that node $i$ acquires a link at time $t$ is given by

$$\Pi_i = \frac{e^{-\beta \epsilon_i} k_i}{\sum_j e^{-\beta \epsilon_j} k_j} \qquad (10)$$

i.e. low energy, high degree nodes are more likely to acquire
new links. The parameter $\beta = 1/T$ in $\Pi_i$ tunes the
relevance of the quality with respect to the degree in the
acquisition probability of new links. Indeed, for $T \to \infty$
the probability $\Pi_i$ does not depend any more on the energy
$\epsilon_i$ and the BN model reduces to the Barabási-Albert
(BA) model, based only on preferential attachment.
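The role of the temperature can be illustrated numerically. The sketch below assumes the Boltzmann-weighted form $\Pi_i \propto e^{-\beta \epsilon_i} k_i$ for the attachment probability; the function name and the three-node example values are hypothetical.

```python
import math


def attachment_probabilities(degrees, energies, beta):
    """Return Pi_i proportional to exp(-beta * eps_i) * k_i, normalized to 1."""
    weights = [k * math.exp(-beta * e) for k, e in zip(degrees, energies)]
    total = sum(weights)
    return [w / total for w in weights]


degrees = [3, 2, 1]          # hypothetical degrees k_i
energies = [0.9, 0.5, 0.1]   # hypothetical energies eps_i

# beta = 0 (T -> infinity): pure preferential attachment, Pi_i = k_i / sum_j k_j.
print(attachment_probabilities(degrees, energies, beta=0.0))

# Large beta (T -> 0): the lowest-energy node dominates despite its low degree.
print(attachment_probabilities(degrees, energies, beta=50.0))
```

For very large $\beta$ or a wide energy range, subtracting the minimum energy before exponentiating would avoid numerical underflow; this is omitted here to keep the correspondence with the formula direct.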

On the other hand, in the limit $T \to 0$ only the lowest
energy node has non-zero probability to acquire new
links. In Ref. [21] it has been shown that the connectivity
distribution in this network model can be mapped
into the occupation numbers of a Bose gas. Accordingly,
one would expect a corresponding phase transition for
the topology of the network at some temperature value
$T_c$. In fact, such a critical value is observed for energy
distributions where $p(\epsilon) \to 0$ for $\epsilon \to 0$. For $T > T_c$ the
system is in the "fit-get-rich" (FGR) phase, where low-energy
nodes acquire links at a higher rate than high-energy
ones, while for $T < T_c$ a "Bose-Einstein condensate"
(BEC) or "winner-takes-all" phase emerges, where
a single node grabs a finite fraction of all the links. We
simulated this model assuming

$$p(\epsilon) = (\theta + 1)\,\epsilon^{\theta} \quad \text{and} \quad \epsilon \in (0, 1) \qquad (11)$$

where $\theta = 0.5$.

FIG. 2: Number of $h$-loops $N_h$ with $h = 3, 4, 5$ in the fitness
model with $m = 2$ (graph (a)) and the GNG model with $p = 0.5$
(graph (b)) and $p = 0.6$ (graph (c)) of size up to $N = 10^4$. The
data asymptotically follow the scaling (1) with exponents that
remain well below the Internet data.

Varying $T$, one observes a change in the
behavior of $N_h$ in the bosonic network from a scaling of
the type (3), shown to be exact in the $\beta = 0$ limit for the
BA network model, to a scaling of the type (1), valid
in the low-temperature limit. In Ref. [16], we claim
that this change occurs right at the Bose-Einstein condensation
temperature $T_c$. A careful analysis of the transition
shows in fact that the transition is rather smooth
at $T_c$.
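The energy distribution of Eq. (11) is simple to sample by inverting its cumulative distribution, $F(\epsilon) = \epsilon^{\theta + 1}$, which gives $\epsilon = u^{1/(\theta + 1)}$ for uniform $u$. The sketch below assumes this inverse-transform approach; the text does not say how energies were drawn in the simulations, and the function name is hypothetical.

```python
import random


def sample_energy(theta, rng):
    """Draw eps with density (theta + 1) * eps**theta on the unit interval."""
    u = 1.0 - rng.random()  # uniform in (0, 1], avoids eps = 0
    return u ** (1.0 / (theta + 1.0))


rng = random.Random(0)
theta = 0.5
samples = [sample_energy(theta, rng) for _ in range(20000)]

# The exact mean of the density is (theta + 1) / (theta + 2) = 0.6 for theta = 0.5.
print(sum(samples) / len(samples))
```

Each sampled value would be assigned to a node when it enters the network and kept fixed afterwards, since the energy is described as an innate quality of the node.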

In order to compare networks with a similar mean degree
($\langle k \rangle = 3.5$ for the Internet), we consider the fitness
model with $m = 2$ ($\langle k \rangle = 2m = 4$) and the GNG
model with parameter $p = 0.5$ ($\langle k \rangle = 2/p = 4$) and
$p = 0.6$ ($\langle k \rangle = 2/p = 3.33$). In the GNG network with
$p = 0.5, 0.6$ one numerically finds $\gamma = 2.5(2)$ [22].

In Fig. 2 we show the scaling of $N_h$ as a function
of the system size for the fitness model with $m = 2$ and
the GNG model with $p = 0.5$, $p = 0.6$. For large $N$,
$N_h(N)$ is a power law as in the real Internet, yet with
much smaller exponents, as shown in Table I.



System         $\xi(3)$       $\xi(4)$       $\xi(5)$
AS             1.45 ± 0.07    2.07 ± 0.01    2.45 ± 0.01
Fitness        0.59 ± 0.02    0.86 ± 0.02    1.10 ± 0.02
GNG (p=0.5)    0.53 ± 0.03    0.72 ± 0.03    0.96 ± 0.02
GNG (p=0.6)    0.53 ± 0.03    0.74 ± 0.03    0.99 ± 0.02

TABLE I: The exponent $\xi(h)$ for $h = 3, 4, 5$ as defined in
equation (1) for real data and network models.

When considering the bosonic network model, the picture
is more complicated. The behavior of the number of loops
depends strongly on the temperature parameter.

We can distinguish a high-temperature phase (the FGR phase), where
$N_h(N)$ is better fitted by (3), and a low-temperature
phase (the BEC phase), where $N_h(N)$ scales as (1).
Even when one decreases the temperature, $\xi(h)$
always remains far from the real network exponents, as
shown in Fig. 3, so that the bosonic network also
fails to reproduce this feature correctly. Furthermore,
no significant sign of a 'winner' node is found in the
Internet data, in which the most connected node has a
fraction of links $k/N = 2024/9048 = 0.22$ for the January
2001 AS data.

FIG. 3: The number of cycles in a bosonic network, for (a)
$\beta = 0.5$ and (b) $\beta = 2.5$. In the inset of (b) we plot the
exponents $\xi(h)$, for $h = 3$ (solid line), 4 (dotted line), 5 (dashed
line) as a function of the inverse temperature $\beta$.

Following previous work, we also measured the clustering coefficients
$c_{3,i}$ and $c_{4,i}$ as a function of the connectivity $k_i$ of
node $i$ for all $i$'s. In particular, $c_{3,i}$ is the usual clustering
coefficient $C$, i.e. the number of triangles including node
$i$ divided by the number of possible triangles $k_i(k_i - 1)/2$.

Similarly, $c_{4,i}$ measures the number of quadrilaterals
passing through node $i$ divided by the number of possible
quadrilaterals $Z_i$. This last quantity is the sum of all
possible primary quadrilaterals $Z_i^p$ (where all vertices are
nearest neighbors of node $i$) and all possible secondary
quadrilaterals $Z_i^s$ (where one of the vertices is a second
neighbor of node $i$). If node $i$ has $k_i^{nn}$ second neighbors,
$Z_i^p = k_i(k_i - 1)(k_i - 2)/2$ and $Z_i^s = k_i^{nn} k_i(k_i - 1)/2$. In
Fig. 4(a) we plot $c_3(k)$, $c_4(k)$ for the Internet data at
three different times (November 1997, January 1999 and
January 2001), showing that the behavior of $c_3(k)$ and
$c_4(k)$ is invariant with time and scales as

$$c_h(k) \sim k^{-\delta(h)} \qquad (12)$$

with $\delta(3) = 0.7(1)$ and $\delta(4) = 1.1(1)$.
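The two node-level coefficients can be computed directly from neighbor sets. The sketch below is one possible reading of the definitions: triangles and quadrilaterals through node $i$ are counted over pairs of its neighbors, and the normalization uses $Z_i^p + Z_i^s$ with $k_i^{nn}$ taken as the number of distinct second neighbors. The function name and the convention of returning 0 when the denominator vanishes are assumptions of this example.

```python
from itertools import combinations


def cycle_clustering(neighbors, i):
    """Return (c3, c4) for node i.

    neighbors maps each node to the set of its adjacent nodes
    (simple undirected graph).
    """
    first = neighbors[i]
    k = len(first)
    # Second neighbors: reachable in two steps, excluding i and its neighbors.
    second = set().union(*(neighbors[j] for j in first)) - first - {i}
    # A triangle through i is a linked pair of neighbors of i.
    triangles = sum(1 for j, l in combinations(first, 2) if l in neighbors[j])
    # A quadrilateral i-j-x-l-i needs a common neighbor x of j and l other than i.
    squares = sum(len(neighbors[j] & neighbors[l]) - 1
                  for j, l in combinations(first, 2))
    z_primary = k * (k - 1) * (k - 2) / 2
    z_secondary = len(second) * k * (k - 1) / 2
    c3 = triangles / (k * (k - 1) / 2) if k > 1 else 0.0
    z_total = z_primary + z_secondary
    c4 = squares / z_total if z_total > 0 else 0.0
    return c3, c4


# Complete graph on 4 nodes: every possible triangle and quadrilateral exists.
k4 = {i: set(range(4)) - {i} for i in range(4)}
print(cycle_clustering(k4, 0))  # (1.0, 1.0)
```

Averaging the returned values over all nodes of the same degree $k$ gives the curves $c_3(k)$ and $c_4(k)$ discussed in the text.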

In Fig. 4 we compare the behavior of $c_3(k)$ and $c_4(k)$
in real Internet data and in the Internet models. We
found a similar behavior in the three network models
and in the Internet, with the $c_3(k)$ and $c_4(k)$ of the Internet
models scaling as (12). Exponents, however, vary
significantly, as shown in Table II.

The fitness model best reproduces the Internet
clustering scaling pattern. Nevertheless, we observe that
the number of triangles and quadrilaterals in real data
is much larger than in the fitness network. Indeed, we
have $c_3(10^3) \simeq 10^{-2}$ and $c_4(10^3) \simeq 10^{-4}$ in the AS network,
while in the fitness model $c_3(10^3) \simeq 10^{-3}$ and
$c_4(10^3) \simeq 10^{-5}$.

FIG. 4: The clustering coefficients $c_3(k)$ and $c_4(k)$ in the Internet
(graph (a)) and in the fitness model (graph (b)), the GNG
model (graph (c)) with $p = 0.5$ (circles), $p = 0.6$ (triangles)
and the bosonic network model with $\beta = 2.5$ (graph (d)).
Empty (filled) symbols refer to $c_3(k)$ ($c_4(k)$). Graph (a)
shows data as obtained in November '97 (circles), January
'99 (squares) and January '01 (triangles).
Solid lines refer to power-law fittings, whose exponents are
reported in Table II.

In conclusion, we computed the number $N_h(t)$ of $h$-loops
of size $h = 3, 4, 5$ in the Internet at the Autonomous
System level and we have identified them as proper motifs
of the Internet. We have then compared the actual
data with the behavior of $N_h(N)$ in the fitness model,
in the GNG model and in the Bosonic network, chosen
as the most accurate Internet models developed, to the
best of our knowledge. In addition, the generalized clustering
coefficients around individual nodes have been investigated
as a function of node degree. We have observed that,
although some qualitative features of the loop scaling and
of the clustering coefficient are captured by the models, the
much larger number of cycles observed in the real network
calls for an improvement of the theory.

The authors are grateful to Uri Alon, Shalev Itzkovitz
and Yi-Cheng Zhang for useful comments and discussions.
This paper has been financially supported by



System                    $\delta(3)$    $\delta(4)$
AS                        0.7 ± 0.1      1.1 ± 0.01
Fitness                   0.67 ± 0.01    0.99 ± 0.01
GNG (p=0.5)               0.32 ± 0.02    1.68 ± 0.03
GNG (p=0.6)               0.27 ± 0.02    0.93 ± 0.01
Bosonic ($\beta = 2.5$)   0.91 ± 0.04    1.07 ± 0.07

TABLE II: The exponents of the clustering coefficients $c_3(k)$
and $c_4(k)$ as measured from Internet data and from simulations
of network models.